Comparing performance of k-Nearest Neighbors, Parzen Windows and SVM Machine Learning Classifiers on QSAR Biodegradation Data across Multiple Dimensions
Author
Abstract
Machine learning and pattern recognition are among the most popular artificial intelligence techniques for modeling systems that can learn from data. These techniques help with classification, regression, clustering, and anomaly detection. k-Nearest Neighbors, Parzen Windows, and Support Vector Machines (SVM) are widely used machine learning classification techniques. This project experimentally compares several features of these classification techniques on the QSAR biodegradation dataset across different dimensionalities obtained with Principal Component Analysis (PCA). The results demonstrate that SVM performs considerably better than k-Nearest Neighbors and Parzen Windows, and that k-Nearest Neighbors performs slightly better than Parzen Windows in classification accuracy. I also find that the ERBF kernel performs better than the other kernel functions (RBF, polynomial, and linear) when used with SVM.
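The kind of comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the author's experimental code. It assumes NumPy only, uses synthetic two-class data in place of the QSAR biodegradation dataset, and leaves out the SVM because it would need a quadratic-programming solver. All function names, the choice of k = 5, the Gaussian window width h = 1.0, and the list of PCA dimensions are hypothetical choices made for illustration.

```python
import numpy as np

def make_toy_data(n=200, d=10, seed=0):
    """Synthetic stand-in for a real dataset: the class shifts feature 0 only."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d))
    X[:, 0] += 4.0 * y
    return X, y

def pca_project(X_train, X_test, n_components):
    """Project both sets onto the top principal components of X_train."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    W = Vt[:n_components].T
    return (X_train - mean) @ W, (X_test - mean) @ W

def pairwise_sq_dists(A, B):
    """Squared Euclidean distance between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return (diff ** 2).sum(axis=2)

def knn_predict(X_train, y_train, X_test, k=5):
    """Majority vote among the k nearest training points (labels are 0/1)."""
    d2 = pairwise_sq_dists(X_test, X_train)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)

def parzen_predict(X_train, y_train, X_test, h=1.0):
    """Gaussian Parzen-window classifier.

    Summing the kernel over the training points of one class gives that
    class's prior times its density estimate, up to a constant shared by
    both classes, so comparing the two sums is enough.
    """
    d2 = pairwise_sq_dists(X_test, X_train)
    K = np.exp(-d2 / (2.0 * h ** 2))
    score0 = K[:, y_train == 0].sum(axis=1)
    score1 = K[:, y_train == 1].sum(axis=1)
    return (score1 > score0).astype(int)

def accuracy_by_dimension(dims=(1, 2, 5, 10)):
    """Test accuracy of both classifiers after PCA to each dimension."""
    X, y = make_toy_data()
    X_tr, y_tr, X_te, y_te = X[:150], y[:150], X[150:], y[150:]
    results = {}
    for m in dims:
        Z_tr, Z_te = pca_project(X_tr, X_te, m)
        results[m] = {
            "knn": float((knn_predict(Z_tr, y_tr, Z_te) == y_te).mean()),
            "parzen": float((parzen_predict(Z_tr, y_tr, Z_te) == y_te).mean()),
        }
    return results

results = accuracy_by_dimension()
for m, acc in results.items():
    print(m, acc)
```

The PCA basis is fitted on the training split only and then applied to the test split, so the test points do not influence the projection. Accuracies on this toy data say nothing about the results reported for the QSAR data.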
Similar resources
Finite Sample Error Bound for Parzen Windows
Parzen Windows as a nonparametric method has been applied to a variety of density estimation as well as classification problems. Similar to nearest neighbor methods, Parzen Windows does not involve learning. While it converges to true but unknown probability densities in the asymptotic limit, there is a lack of theoretical analysis on its performance with finite samples. In this paper we establ...
Learning Nearest-Neighbor Classifiers with Hyperkernels
We consider improving the performance of k-Nearest Neighbor classifiers. A regularized kNN is proposed to learn an optimal dissimilarity function to substitute for the Euclidean metric. The learning process employs hyperkernels and shares a regularization framework similar to that of support vector machines (SVM). Its performance is shown to be consistently better than that of kNN and competitive with that of SVM.
A Brief Review of Classifiers used in OCR Applications
The performance of a recognition system depends upon the classifiers used for classification. The more powerful the discrimination ability of a classifier, the better its recognition performance. The generalization ability of a classifier is measured on the basis of its performance in classifying the test patterns. There are various factors which affect generalization. Moreover, the feature ex...
Diagnosis of Breast Cancer Subtypes using the Selection of Effective Genes from Microarray Data
Introduction: Early diagnosis of breast cancer and the identification of effective genes are important issues in the treatment and survival of the patients. Gene expression data obtained using DNA microarray in combination with machine learning algorithms can provide new and intelligent methods for diagnosis of breast cancer. Methods: Data on the expression of 9216 genes from 84 patients across...
A comparison between k-Optimum Path Forest and k-Nearest Neighbors supervised classifiers
This paper presents the k-Optimum Path Forest (k-OPF) supervised classifier, which is a natural extension of the OPF classifier. k-OPF is compared to the k-Nearest Neighbors (k-NN), Support Vector Machine (SVM) and Decision Tree (DT) classifiers, and we see that k-OPF and k-NN have many similarities. This work shows that the k-OPF is equivalent to the k-NN classifier when all training samples a...